CAATTCCCCTCCAATCATTCTCTCTTCGCTATCCATTCACAAACCGAAACTTCTCATCCA 
1 + ♦ ^ ♦ ^ 60 

TAACAAGTCATTCAAACACGCCCAAGATGTCTATGCGTATAACACAATATTTGGACAAAT 

CCTCCCAAAACTCGAAATTCTCACCCATAAAATCATTAACTTCAACCGCCTAATGTAACT 

TATCTCATCTTTCTACAATTAAAAAAATTCTTTTTTTTTCCAAATTAATTTTCCAACATT 
181 + — -—-—-I— — — — — — -f- — — — ^ 240 

AACCAAAAACC ATTAAAAATCAATAAAACCCAATAAACAGCCCTTCC C TII C I I II l A AT 

TTAAATTATAATTTTTCTCATTGTTGTATCAACCTACAAAATGTACrCTrrTTCTArrTG 
301 + + + 4 ^ + 360 

AATATTGTATTACACCCTTGCGATTCTCCCCAAATATCACCGACACTCCAAGATTXACAA 
361 + + + + + + 420 

CAACGACGTGTGACAATCACTAAGTCAAAGAGCGAAACGATAAACGATTGTCATATTTCA 
421 — — + + . — + — + + 480 

CTCTTTTACTCATTCCCTTTTTAAATAACAACTATATCCCGATTrGCCCATATATTTTTG 

481 ♦ + + + ^ 540 

TTTATTACCCCTCTCACATTCCTGTACAATGTTTCTACCAAATAAACTCCATTTTTATCT 

GAAAATTCGAATTTATTTTTCTCTACTTTTTACTCGTTGCATTCCACATCAGCATATCTT 
601 + + + + — ^ + 660 

CCGCTCTATTTATATTCAACGATTTTTATAAATTACTACTCCTTCATGTTTAATTTCATT 
661 + + + + + + 720 

TTATCTGTAACCTTTACTGTATTTTTTTAAAATCTTTCTTGCTTCTATCTCATTATACAA 
721 + + + + + 4 780 

TCTTCTTTACTCATTTTCAACGTATTTTTATGCCTCACAATTTATCCACATTTCGCGCTT 
781 + -f + + + + 840 

CGACATTTATCCTCTATATTACATGCCTGTTTTTTTAAACGATATAATGTTTAACAAATA 
841 + ^ + + + + + 900 

ATTTTTTATCAATCCTATTGTATATTCTCCAGCTAACCGTTGTTTCGAAAACATCACCTA 
901 V— + + + + + + 960 

ccatttAaaattcacaaaatcttgcttccttataatcaagaacatttttcagatgctct 

961 + + + + + + 1020 

M L C 

Xgaaatcgaatccccccctttgagcacggcacacacgacgctcatccacgac^tgaac 

1021 + + + ♦ + 1080 

ciecralstahtrlirofep 

>r ,10 20 . 

T n1162

cacgtgacgcattgacttatttagaagccaaaaacattttcacacaacatcattctgaac 

1081 + + + + + ^ + 1140 

ROALTYLEGKNIFTEDHSEL 
30 40 
TTATCACTAAAATCTCAACTCCCCTCGAGACGAT.CGCCAATTTTCTTCCAATCTATCGAC 

1141 + — + + — + + + 1200 

Z5KMSTRLCRIAHF LRXYRR 
SO 60 
CTCAAGCTTCTCAACTTCCACCACTCATCGACTTTTTCAACTACAACAATCAAAGTCACC 

1201 + + + + 1260 

QASELGPLIOFFNYNMQSflL 
70 80 
TTGCTGATTTCCTCGAAGACTACATCCATTTTCCGATAAATGACCCACATCTACTTCCTC 

1261 + + + + + + 1320 

AO FLEDYIOFAIMEPDLLRP 
90 100 
CACTAGTCATTCCTCCACAATrrTCCCGACAAATCCTCCATACCAAACTATTCCTTGCGA 

1321 * f + + + + 1380 

VVIAPQFSRQMLDRKLLLCN 
110 120 




T n2274 

ATCTTCCAAAACAAATGACATGCTXTATTCGACACTATCACCTCCATicACTCATCAAAA 

1381 + ♦ + ♦ + 1440 

VPKQMTCYIREYHVORVIKK 
130 140 
I Intron 1 

ACCTCCACCAGATCrrGTCATTTAGpTGACAAAACTCCAACCTCTCGTCTTTATTATAATC 
LOEMCDtD 



I 150 

ZAckc 



TTGCTTAAACTTCAGACTCCTTTTTTCTCTTTCTACACGCCCCACCTGGATCCGCAAAAT 

1501 + -I >~ * + 1560 

SFPLFLHGRACSG.KS 

160 

Intron 2
TGAGTGCTAT 

1561 4- + + ♦ + ♦ 1620 

VIASQALSKSOQLZGI 
170 I IBO 

TATCTGAATCTACGGATCTTCATTCTATTACAaAAATTATGATTCAATCCTTTGCCTCAA 

1621 + +— — + ^ + + + 1680 

NYOSIVHLK 

190 

AGATAGTGSAACACCTCCAAAATCTACATTCCATTTATTTACGGATATTTTCCTCATGCT 
1681 + + + + -f— + 1740 

DSGTAPK STFDLFTDI LLML 

200 210 

A n1920/n2247
Intron 3

AA^STGAGTGAATAGAGTGCATGTAACATTCAGCATGATTTTGAAATTATGAAAATTTGA 

1741 + + + + • ♦ + 1800 

K 

CCTGGTTAGCTTTTAATTTCATATTTCGTGACGCTTGCATGTTTTGTGTGTTTCAAGACG 
1801 — + + + + + 1860 

AGCCCGTGTTGTGAGCGACACGGATGACTCGCATTCGATCACCGACTTCATTAACCGTGT 

1861 + + + -t^ + 1920 

A| n2273 

TCTTTCAAGkAGCGAAGACGATCTTCTCAATTTCCCATCGGTGGAGCATGTCACCTCAGT 

1921 + -i + + + 1980 

SEDDLLNFP SVEHVTSV 

220 

I Intron 4 

TCTACTCAAAAGGATGpTAAGTTGCTTGCCGATTCTCGTACAATATCTTAAATTATTGGT 

V L K R .M 
230 I 

TTTTAGtvTCTCCAACGCACTCATTGATCGTCCAAATACTTTATTCGTATTTGATGACCTA 

2041 + — + -»- — + + + 2100 

ICNALIDRPNT LFVFDDV 
240 250 
A n1948 T n1947

t t 

GTTCAACAAGAAACAATTCGTTCGGCrCAGGAGCTACGTCTTCCATGTCTTCTAACTACT 

VQEETIRWAQELRLRCLVTT 

260 270 
CGTGACGTCGAAATATCAAATGCTGCTTCTCAAACATGCGAATTCATTGAAGTGACATCA 
2161 + + + + + 2220 

ROVEISNAA SQTCEFIEVTS 

280 290 
TTCCAAATCCATGAATGTTATGATTTTCTAGAACCTTATGGAATCCCGATCCCTCTTGGA

LEIDECrOFLEAYGMPMPVC 

300 310 
Tc4 n1416

caaaaagaacaacatgtgcttaataaaacaatcgaactaJccactcgaaatccaccaacg 
ekeedvl nktielssghpat 

320 330 

Intron 5

CTTATGATGTTTTTCAACTCTTGTCAACCGAAAACATTTGAAAAGfrCACTGCGACATACC 

LMMrFKSC EPKT FEK 

330 

AATTTGAGACTTTTAAAATAATTTATTCTACAATAAAACTTAATCAAAAACTTTCATACC 

TCATTCTCTTTAAATTTTACCAATTGACGATCAAAATCAAGAATTAGCATCCTGCCACGA 
2461 + + + + + + 2520

GAGAAAACTGTCTACCTACCGTACCCCAGAGATTTTCTTGATArrTGCCATCCATTTAAT 
2521 +- + ^ ♦ + 2580 

TTTTTAAGAAAATTATCGTTTTACATAATTGAACAAGACATACACGGTCTCGACCCCACG 
2581 + + + + + + 2640

GAAATTTTTTAAATGAAACCGAGTATGAGCCTGTTTTCATTATTTTTCGATTTTCTCTTG 

TTGTTTCTTTTTATTTAAACCCTTTTATTTTGAAACAAGTCTAAAAATATTAAAAACTG A 
2701 + * + ^ 2760 

ATAAAATATTTAAAAAAAATCAAGTAAAATAGAAAAACAGCAAGGCTCCACACTACTGTA 
2761 + -f ^ + + 2820 

CTTCTTAAATCCGCATACTCTTTTTATTTAATCATTTTCCCGAATGTCGAAACCAAATAA 

T AC ATTTTT AGTCCAAAATCGCT ACGT ATATTCTT AAAATT ATCAAACATTTTGC ATTC A 

^ATGGCACAGCTTAATAACAAATTGGAAAGTCGAGGATTAGTCGGTGTTGAATCTATCA 
2941 + + * + + 3000 

MAQLNNKL ESRCLVGVECIT 
340 350 

CCCCTTACTCGTACAACTCACTCGCAATCCCTCTTCAAAGATGTGTTCAAGTTTTCTCAG 

PYSYKSLAMALQRCVEVLSD 
360 370 

ATGACGATCGAAGTGCTCTTGCTTTCGCAGTTGTGATCCCTCCTGGAGTTGATATACCCG 

3061 + + + + + 3120 

E D R S.A LAF A VVM P P G VD I P V 
380 390 
A n2894 

\ 

tcaagctatggtcatgtgttattccagttcatatttcttcaaatgaagaacaacaattcg 
kl'hscvipvd icsneeeqld 

400 410 

I Intron 6 

ATGATGAACTTGCGGATCGGTTGAAAAGACTCACCAMSTATGACTCTTGAAATTTGAAGA 

3181 + + + + + + 3240

DEVADRLKRL5K 
420 I 
TTTAAATTAACACTTAAAArrTCAGlACGTGGACCTCTTCTCAGTGGAAAACGAATGCCCG 

RGALLSGXRHPV 
430 440 
TTTTCACATTCXAAATTCATCATATTATCCATATCTTCTTCAAACACCTCCTTCATCCAC

LTFKIOHIIHMFLKHVVDAQ 

450 460 
I Intron 7 I
AAACTATCCCCpTATCCTCAAAATGTCTCAACTTTCAATTAAATTTTAAATTTTCACRAT 

3361 + ^ ^ 4- + 3420 

T I A N 
CCAATCTCAATTCTCCACCACCCTCTTCTTCAAATACGAAACAATAATCTATCACTACCC 

CISILEQRLI-EICNNN VSVP 

470 480 
CACCCACAIATACCATCACATTTCCAAAAATTCCCTCCTTCATCACCCACTCACATGTAT 
3481 3540
ERHIPSHFQKFRRSSASEMY 

500 510 
CCAAAAACTACACAACAAACTCTCATCCGTCCrCAACACTrCCCAAAGTTCATCCAATTG 

3541 + + * + + * 3600 

PKTTEETVIRPEDFPKFMQL 

520 530 
CACCAGAAATTCTATGACTCCCTCAAAAATTTTGCATCCTCTTAAAACCTATCGTGTACA 

H QKF.YDSLKNFACC^ 
540 

ATATTGCCTGTATATTCCCCTCGAAATACCTTTATACTTTTTCGCACGAGTTTTCTCATT 

3661 + -f + + + * 3720 

TTTTCATTTGTACTTCTTTTATTTCTCTCCAAAATTTCAGATCTATCCCAAATGTTCTTA 

3721 + * + + + * 3780 

AATTTAATCTTTTCTACAGATACTCAACACATCTTGTTTCATCTCATCCTTGCTTTTTTT 

3781 + > + + * * 3840 

TTtCAAATATATTCACTTTCTTTTATAATTTTAATTAATCGAATTAATACATTCACGTAA 

3841 + + +- + + ♦ 3900 

AGAATTTCCIGGACTATTArrTTATCGCATCCAAATGATTTATTCCCTATTCTTCGAAAC 

3901 -¥ ♦ +— — + — + + 3960 

TTCCAAATTCATCATTTTTAAACACGCCrCATTAAATTCAAAGTCCTACTTTTAGTCTCG 

3961 + + + * + 4020 

AACATGAAGTAAGTTATTXTCTGTCTTCTAAATTCAAAGTCCATTCCAAAAGGACATTTG 

4021 — 4> — ' — + + + + 4080 

ATCAGTTTTCACGAAAACCGTAATXTTTACAATTTCCTTTCAGTTTTGAAGATGTTCCAT 

4081 + ^ + + + 4140 

TTCTTTCCTCTCTTCGCGTCATTACTACATTTGCTTTGCTCCTTCACTTTATCCACATTC 

4141 + ' + + + * 4200 

TTGCCATCAATCCAGTTCCATCTAGACCGATAGCAGTCTTCATATCATTATCCCTGTATA 

4201 + ♦ + + 4260 

TTGTACTGTTTCAGTATTTTAACITATCGATTACGTACrATATTCAGTGGTTCACTCTTT 

4261 + + + + + 4320

TCCCTCAATGCGTCACACGTCCTCCACCANNAAT.TTTCAACGAACCCAATCTCCTACTCA 

CTTATCAACCAACAGCCCTCACCCATG 
4381 > 4407 



ced-3 Genomic Sequence



AGATCTGAAATAAGGTGATAAATTAATAAATTAAGTGTATTTCTGAGGAAATTTGACTGT 

1 + + ^ ^ ^ ^ 

TTTAGCACAATTAATCTTGTTTCAGAAAAAAAGTCCAGTTTTCTAGATTTTTCCGTCTTA 
61 + + ^ ^ ^ ^20 

TTGTCGAATTAATATCCCTATTATCACTTTTTCATGCTCATCCTCGAGCGGCACGTCCTC 



AAAGAATTGTGAGAGCAAACGCGCTCCCATTGACCTCCACACTCAGCCGCCAAAACAAAC 



AACATATTTGACGGCAAAATATCTCGTAGCGAAAACTACAGTAATTCTTTAAATGACTAC 
+ + + + + ^ 

Repeat 1

„ — — ^ — _„„ <— — — 

TGTAGCGCTTGTGTCGATTTACGGGCTCAATTTTTGAAAATAATTTTTTTTTTCGAATTT 






GTTCGAACATTCGTGTGTTGTGCTCCTTTTCCGTTATCTTGCAGTCATCTTTTGTCGTTT 
+ + -+ + + + 300 

TTTTCTTTGTTCTTTTTGTTGAACGTGTTGCTAAGCAATTATTACATCAATTGAAGAAAA 
+ ^ ^ ^ ^ ^ 

GGCTCGCCGATTTATTGTTGCCAGAAAGATTCTGAGATTCTCGAAGTCGATTTTATAATA 

+ + + ^ ^ ^ 0 

TTTAACCTTGGTTTTTGCATTGTTTCGTTTAAAAAAACCACTGTTTATGTGAAAAACGAT 
+ + ^ ^ ^ 

TAGTTTACTAATAAAACTACTTTTAAACCTTTACCTTTACCTCACCGCTCCGTGTTCATG 
+ + + ^ ^ ^ 

GCTCATAGATTTTCGATACTCAAATCCAAAAATAAATTTACGAGGGCAATTAATGTGAAA 
. — + + + ^ ^ ^ 

CAAAAACAATCCTAAGATTTCCACATGTTTGACCTCTCCGGCACCTTCTTCCTTAGCCCC 
+ + ^. ^ ^ 

ACCACTCCATCACCTCTTTGGCGGTGTTCTTCGAAACCCACTTAGGAAAGCAGTGTGTAT 

+ _ + + ^ ^ 0 

CTCATTTGGTATGCTCTTTTCGATTTTATAGCTCTTTGTCGCAATTTCAATGCTTTAAAC 
+ + + ^ ^ ^ 

AATCCAAATCGCATTATATTTGTGCATGGAGGCAAATGACGGGGTTGGAATCTTAGATGA 

+ ^ ^ ^ ^ _^ ^ 

GATCAGGAGCTTTCAGGGTAAACGCCCGGTTCATTTTGTACCACATTTCATCATTTTCCT 
+ _ + ^ ^ ^ 

GTCGTCCTTGGTATCCTCAACTTGTCCCGGTTTTGTTTTCGGTACACTCTTCCGTGATGC 
+ ^ ^ ^ ^ 

CACCTGTCTCCGTCTCAATTATCGTTTAGAAATGTGAACTGTCCAGATGGGTGACTCATA 

+ + + ^ ^ ^ ^^20 

TTGCTGCTGCTACAATCCACTTTCTTTTCTCATCGGCAGTCTTACGAGCCCATCATAAAC 

+ ^ ^ ^ ^ ^^g^ 

TTTTTTTTCCGCGAAATTTGCAATAAACCGGCCAAAAACTTTCTCCAAATTGTTACGCAA 

+ + ^ ^ ^ ^ ^^^^ 

TATATACAATCCATAAGAATATCTTCTCAATGTTTATGATTTCTTCGCAGCACTTTCTCT 

+ t' r + + + + ^ 12 00 

TCGTGTGCTAACATCTTATTTTTATAATATTTCCGCTAAAATTCCGATTTTTGAGTATTA 

-+ + ^ ^ ^ ^ ^2 60 

ATTTATCGTAAAATTATCATAATAGCACCGAAAACTACTAAAAATGGTAAAAGCTCCTTT 

+ + + ^ ^ ^ ^^^^ 

Repeat 1 



TAAATCGGCTCGACATTATCGTATTAAGGAATCACAAAATTCTGAGAATGCGTACTGCGC 

+ + ^ ^ ^ ^^^^ 



TGATAACCCGTAAATCGTCACAACGCTACAGTAGTCATTTAAAGGATTACTGTAGTTCTA
1501 + + + + + + 1560



GCTACGAGATATTTTGCGCGCCAAATATGACTGTAATACGCATTCTCTGAATTTTGTGTT 
1561 + + + + + + 1620

TCCGTAATAATTTCACAAGATTTTGGCATTCCACTTTAAAGGCGCACAGGATTTATTCCA 
1621 + + + + + + 1680

ATGGGTCTCGGCACGCAAAAAGTTTGATAGACTTTTAAATTCTCCTTGCATTTTTAATTC 
1681 + + + + + + 1740

AATTACTAAAATTTTCGTGAATTTTTCTGTTAAAATTTTTAAAATCAGTTTTCTAATATT 
1741 + + + + + + 1800

TTCCAGGCTGACAAACAGAAACAAAAACACAACAAACATTTTAAAAATCAGTTTTCAAAT 
1801 + + + + + + 1860

TAAAAATAACGATTTCTCATTGAAAATTGTGTTTTATGTTTGCGAAAATAAAAGAGAACT 
1861 + + + ^ ^ ^ 1920 

GATTCAAAACAATTTTAACAAAAAAAAACCCCAAAATTCGCCAGAAATCAAGATAAAAAA 
1921 + + + + + + 1980

TTCAAGAGGGTCAAAATTTTCCGATTTTACTGACTTTCACCTTTTTTTTCGTAGTTCAGT 
1981 — + + + + ^ ^ 2040 

GCAGTTGTTGGAGTTTTTGACGAAAACTAGGAAAAAAATCGATAAAAATTACTCAAATCG 
2041 +- + + + + ^ 2100 

AGCTGAATTTTGAGGACAATGTTTAAAAAAAAACACTATTTTTCCAATAATTTCACTCAT 
2101 +- + + ^ ^ ^ 2160 

TTTCAGACTAAATCGAAAATCAAATCGTACTCTGACTACGGGTCAGTAGAGAGGTCAACC 
2161 . + ^ ^ ^ ^ 2220 

ATCAGCCGAAGATGATGCGTCAAGATAGAAGGAGCTTGCTAGAGAGGAACATTATGATGT 

2221 + + + X X _L 

-p + + + + 2280 

MMRQDRRSLLERNIM MF 
1 10 
T (n1040)
I 

TCTCTAGTCATCTAAAAGTCGATGAAATTCTCGAAGTTCTCATCGCAAAACAAGTGTTGA 
2281 + + + + ^ ^ 2340 

SSHLKVDEI LEVLIAKQVLN 
20 30 

/ I intron 1 

ATAGTGATAATGGAGATATGATTAATGTGAGTTTTTAATCGAATAATAATTTTAAAAAAA 
2341 + + + + ^ ^ 2400 

SDNGDMIN
40 

I 

AATTGATAATATAAAGAATATTTTTGCAGTCATGTGGAACGGTTCGCGAGAAGAGACGGG 
2401 + + + + + + 2460

SCGTVREKRRE 
50 

A(n718) 
I 

AGATCGTGAAAGCAGTGCAACGACGGGGAGATGTGGCGTTCGACGCGTTTTATGATGCTC 
2461 + + + + ^ ^ 

IVKAVQRRGDVAFDAFYDAL 
60 70 

ttcgctctacgggacacgaaggacttgctgaagttcttgaacctctcgccagatcgtagg
rstgheglaevleplars

80 90 
TTTTTAAAGTTCGGCGCAAAAGCAAGGGTCTCACGGAAAAAAGAGGCGGATCGTAATTTT
2581 + + + + + + 2640

GCAACCCACCGGCACGGTTTTTTCCTCCGAAAATCGGAAATTATGCACTTTCCCAAATAT 
2641 + + + + + + 2700

TTGAAGTGAAATATATTTTATTTACTGAAAGCTCGAGTGATTATTTATTTTTTAACACTA 
2701 + + + + + + 2760 

ATTTTCGTGGCGCAAAAGGCCATTTTGTAGATTTGCCGAAAATACTTGTCACACACACAC 
2761 + + + + ' — + + 2820 

I 

ACACACATCTCCTTCAAATATCCCTTTTTCCAGTGTTGACTCGAATGCTGTCGAATTCGA 

2821 + -+ +- + + + 2880 

VDSNAVEFE 
100 

GTGTCCAATGTCACCGGCAAGCCATCGTCGGAGCCGCGCATTGAGCCCCGCCGGCTACAC 

2881 + + + + + + 2940 

CPMSPASHRRSRALSPAGYT 
110 120 

TTCACCGACCCGAGTTCACCGTGACAGCGTCTCTTCAGTGTCATCATTCACTTCTTATCA 

2941 _ + + + + + + 3000 

SPTRVHRDSVSSVSSFTSYQ
130 140 

GGATATCTACTCAAGAGCAAGATCTCGTTCTCGATCGCGTGCACTTCATTCATCGGATCG 

3001 + + + + + — + 3060 

DIYSRARS RSRSRALHSSDR 
150 160 

I intron 3 ^ 
ACACAATTATTCATCTCCTCCAGTCAACGCATTTCCCAGCCAACCTTGTATGTTGATGCG 

3061 ' +-- + + + + + 3120 

HNYSSPPV NAFPSQPS 
170 

Repeat 1



AACACTAAATTCTGAGAATGCGCATTACTCAACATATTTGACGCGCAAATATCTCGTAGC 
3121 + + + + + + 3180 



GAAAAATACAGTAACCCTTTAAATGACTATTGTAGTGTCGATTTACGGGCTCGATTTTCG 
3181 + + + + + + 3240 

AAACGAATATATGCTCGAATTGTGACAACGAATTTTAATTTGTCATTTTTGTGTTTTCTT 
3241 + + + + • + + 3300 

Repeat 1 

TTGATATTTTTGATCAATTAATAAATTATTTCCGTAAACAGACACCAGCGCTACAGTACT 
3301 + + + + + + 3360 



CTTTTAAAGAGTTACAGTAGTTTTCGCTTCAAGATATTTTGAAAAGAATTTTAAACATTT 
3361 + + + + + + 3420 

TGAAAAAAAATCATCTAACATGTGCCAAAACGCTTTTTTCAAGTTTCGCAGATTTTTTGA 
3421 + + — + + + + 3480 



Repeat 2

TTTTTTTCATTCAAGATATGCTTATTAACACATATAATTATCATTAATGTGAATTTCTTG 
3481 + + + + + + 3540



TAGAAATTTTGGGCTTTTCGTTCTAGTATGCTCTACTTTTGAAATTGCTCAACGAAAAAA 
3541 + + + + + 3600 



TCATGTGGTTTGTTCATATGAATGACGAAAAATAGCAATTTTTTATATATTTTCCCCTAT 
3601 + + + +„ + + 3660 



TCATGTTGTGCAGAAAAATAGTAAAAAAGCGCATGCATTTTTCGACATTTTTTACATCGA 
3661 + + + + + + 3720 

ACGACAGCTCACTTCACATGCTGAAGACGAGAGACGCGGAGAAATACCACACATCTTTCT 
3721 + + + + + + 3780

Repeat 2 

<-———-—--————————————„„„.«. «„„_«^.__ 

GCGTCTCTCGTCTTCAGCATGTGAAATGGGATCTCGGTCGATGTAAAAAAATGTCGAATA 
3781 + + . T + + + 3840 



ATGTAAAAAATGCATGCGTTTTTTTACACTTTTCTGCACAAATGAATAGGGGGAAAATGT 
3841 + + + + + + 3900



ATTAAAATACATTTTTTGTATTTTTCAACATCACATGATTAACCCCATTATTTTTTCGTT 
3901 + + + + + + 3960



GAGCAACTTAAAAAGTAGAGAATATTAGAGCGAAAACCAAAATTTCTTCAAGATATTACC 
3961 + + + +„. + + 4020 



TTTATTGATAATTATAGATGTTAATAAGCATATCTTGAATGAAAGTCAGCAAAAATATGT 
4021 + + + + + + 4080

GCGAAACACCTGAAAAAAATCAAAAATTCTGCGAAAATTGAAAAAATGCATTAAAATACA 
4081 + + + + + ^ 

TTTTTGCATTTTTCTACATCACATGAATGTAGAAAATTAAAAGGGAAATCAAAATTTCTA 
4141 ■• + + + + + ^ 4200 

GAGGATATAATTGAATGAAACATTGCGAAATTAAAATGTGCGAAACGTCAAAAAAGAGGA 
4201 + + + + ^ ^ 4260 

I 

AATTTGGGTATCAAAATCGATCCTAAAACCAACACATTTCAGCATCCGCCAACTCTTCAT 
4261 + + + + + ^ 4320 

SANSSF
180 
TCACCGGATGCTCTTCTCTCGGATACAGTTCAAGTCGTAATCGCTCATTCAGCAAAGCTT
4321 + + + + + + 4380

TGCSSLGYSSSRNRSFSKAS 
190 200 

CTGGACCAACTCAATACATATTCCATGAAGAGGATATGAACTTTGTCGATGCACCAACCA 
4381 + + + + + + 4440

GPTQYIFHEEDMNFVDAPTI 
210 220 

TAAGCCGTGTTTTCGACGAGAAAACCATGTACAGAAACTTCTCGAGTCCTCGTGGAATGT 
4441 + + + + + + 4500

SRVFDEKT MYRNFSSPRG MC 
230 240 

GCCTCATCATAAATAATGAACACTTTGAGCAGATGCCAACACGGAATGGTACCAAGGCCG 
4501 + + + + + + 4560

LI INNEHFEQMPTRNGTKAD 
250 260 

ACAAGGACAATCTTACCAATTTGTTCAGATGCATGGGCTATACGGTTATTTGCAAGGACA 
4561 + + + + + + 4620

KDNLT NLFRCMGYTVICKDN 
270 280 

I intron 4 

ATCTGACGGGAAGGGTACGGCGAAATTATATTACCCAAACGCGAAATTTGCCATTTTGCG 
4621 + + + + + + 4680

L T G R 

Repeat 3 

■"""---"----------------------> 

CCGAAAATGTGGCGCCCGGTCTCGACACGACAATTTGTGTTAAATGCAAAAATGTATAAT 
4681 + + + + ^ 

TTTGCAAAAAACAAAATTTTGAACTTCCGCGAAAATGATTTACCTAGTTTCGAAATTTTC 
4741 + + + + + ^ 

GTTTTTTCCGGCTACATTATGTGTTTTTTCTTAGTTTTTCTATAATATTTGATGTAAAAA 
4801 + + + + + + 4860

ACCGTTTGTAAATTTTCAGACAATTTTCCGCATACAAAACTTGATAGCACGAAATCAATT 
4861 + + + + + + 4920

TTCTGAATTTTCAAAATTATCCAAAAATGCACAATTTAAAATTTGTGAAAATTGGCAAAC 
4921 + + + + + + 4980

GGTGTTTCAATATGAAATGTATTTTTAAAAACTTTAAAAACCACTCCGGAAAAGCAATAA 
4981 + + + + ^ ^ 

AAATCAAAACAACGTCACAATTCAAATTCAAAAGTTATTCATCCGATTTGTTTATTTTTG 

5041 + + + + . ^ e,«« 

^ ^— + 5100 

CAAAATTTGAAAAAATCATGAAGGATTTAGAAAAGTTTTATAACATTTTTTCTAGATTTT 
5101 + + + + + + 5160

TCAAAATTTTTTTTAACAAATCGAGAAAAAGAGAATGAAAAATCGATTTTAAAAATATCC 
5161 + + + + + + 5220

Repeat 3 

ACAGCTTCGAGAGTTTGAAATTACAGTACTCCTTAAAGGCGCACACCCCATTTGCATTGG 

5221 + + + + . ^ 

-r -r — + + 5280 



ACCAAAAATTTGTCGTGTCGAGACCAGGTACCGTAGTTTTTGTCGCAAAAATTGCACCAT 
TGGACAATAAACCTTCCTAATCACCAAAAAGTAAAATTGAAATCTTCGAAAAGCCAAAAA 



5341 ^ -T- T f +_ ^ 

T -c — + + 5 4 00 



ATTCAAAAAAAAAGTCGAATTTCGATTTTTTTTTTGGTTTTTTGGTCCCAAAAACCAAAA 
5401 + + + + + + 5460 

AAATCAATTTTCTGCAAAATACCAAAAAGAAACCCGAAAAAATTTCCCAGCCTTGTTCCT 

5461 + + + + + + 5520

I 

AATGTAAACTGATATTTAATTTCCAGGGAATGCTCCTGACAATTCGAGACTTTGCCAAAC 
5521 + + +— + + + 5580 

GMLLTIRDFAKH 
290 300 

ACGAATCACACGGAGATTCTGCGATACTCGTGATTCTATCACACGGAGAAGAGAATGTGA ' 
5581 + + + + + + 5640 

ESHGDSAILVILSHGEENVI

310 320 

TTATTGGAGTTGATGATATACCGATTAGTACACACGAGATATATGATCTTCTCAACGCGG 
5641 + + + + + + 5700 

IGVDDIPISTHEIYDLLNAA 

330 340 

A(n2433)

- 11 intron 
CAAATGCTCCCCGTCTGGCGAATAAGCCGAAAATCGTTTTTGTGCAGGCTTGTCGAGGCG 
5701 + + + + + 5760

NAPRLANKPKIVFVQACRGE

350 360 

I 

GTTCGTTTTTTATTTTAATTTTAATATAAATATTTTAAATAAATTCATTTTCAGAACGTC 
5761 + + + + + + 5820

R R

GTGACAATGGATTCCCAGTCTTGGATTCTGTCGACGGAGTTCCTGCATTTCTTCGTCGTG 
5821 + + + + + + 5880

DNGFPVLDSVDGVPAFLRRG
370 380

T(n1165)

GATGGGACAATCGAGACGGGCCATTGTTCAATTTTCTTGGATGTGTGCGGCCGCAAGTTC 

5881 + + + + + + 5940

WDNRDGPLFNFLGCVRPQVQ
390 400 

! intron 6 

AGGTTGCAATTTAATTTCTTGAATGAGAATATTCCTTCAAAAAATCTAAAATAGATTTTT 
5941 + + + + + + 6000

ATTCCAGAAAGTCCCGATCGAAAAATTGCGATATAATTACGAAATTTGTGATAAAATGAC 
6001 + + + + + + 6060

Repeat 4 



AAACCAATCAGCATCGTCGATCTCCGCCCACTTCATCGGATTGGTTTGAAAGTGGGCGGA 
6061 + + + + + ^ 

--—-------------> 

GTGAATTGCTGATTGGTCGCAGTTTTCAGTTTAGAGGGAATTTAAAAATCGCCTTTTCGA 

6121 + + + + + + 6180

AAATTAAAAATTGATTTTTTCAATTTTTTCGAAAAATATTCCGATTATTTTATATTCTTT 
6181 + + + + + + 6240
A(n717)
I 

GGAGCGAAAGCCCCGTCCTGTAAACATTTTTAAATGATAATTAATAAATTTTTGCAGCAA 
6241 + + + + + 6300

Q 

T(n1949)
I 

GTGTGGAGAAAGAAGCCGAGCCAAGCTGACATTCTGATTCGATACGCAACGACAGCTCAA 
6301 + + + + + + 6360

VWR KKPSQADILI RYATTAQ 
410 420 

A(n1286)
I 

TATGTTTCGTGGAGAAACAGTGCTCGTGGATCATGGTTCATTCAAGCCGTCTGTGAAGTG 
6361 + + + + + + 6420

YVSWRNSARGSWFI QAVCEV 
430 440 

T (n1129, n1164)
I 

TTCTCGACACACGCAAAGGATATGGATGTTGTTGAGCTGCTGACTGAAGTCAATAAGAAG 

6421 + + ^ 6480 

FST HAKDMDVVELLTEVNKK 
450 460 

T(n2430) A(n2426)
' II intron 7 
GTCGCTTGTGGATTTCAGACATCACAGGGATCGAATATTTTGAAACAGATGCCAGAGGTA 
6481 + + + + + + 6540

VACGFQTSQGSNILKQMPE 
470 480. 

Repeat 5 

CTTGAAACAAACAATGCATGTCTAACTTTTAAGGACACAGAAAAATAGGCAGAGGCTCCT 
6541 + + + + + + 6600



TTTGCAAGCCTGCCGCGCGTCAACCTAGAATTTTAGTTTTTAGCTAAAATGATTGATTTT 

6601 + + + + + + 6660

GAATATTTTATGCTAATTTTTTTGCGTTAAATTTTGAAATAGTCACTATTTATCGGGTTT 

6661 + + + + + + 6720

CCAGTAAAAAATGTTTATTAGCCATTGGATTTTACTGAAAACGAAAATTTGTAGTTTTTC 

6721 + + + + + + 6780



AACGAAATTTATCGATTTTTAAATGTAAAAAAAAATAGCGAAAATTACATCAACCATCAA 
+ + + -f ^ ^ 

GCATTTAAGCCAAAATTGTTAACTCATTTAAAAATTAATTCAAAGTTGTCCACGAGTATT 



6841 . . . , ^ ^ 

Repeat 5 

ACACGGTTGGCGCGCGGCAAGTTTGCAAAACGACGCTCCGCCTCTTTTTCTGTGCGGCTT 

6901 + + ^ . ^ft^rt 

-r . + ^ 6960 

T(n1163)

GAAAACAAGGGATCGGTTTAGATTTTTCCCCAAAATTTAAATTAAATTTCAGATGACATC 
6961 + + + 4 + 7020 

M T S 



CCGCCTGCTCAAAAAGTTCTACTTTTGGCCGGAAGCACGAAACTCTGCCGTCTAAAATTC 

7021 + + + + + + 7080 

RLLKK FYFWPEARNSAV* 
490 500 

ACTCGTGATTCATTGCCCAATTGATAATTGTCTGTATCTTCTCCCCCAGTTCTCTTTCGC 
7081 + + + + + + 7140

CCAATTAGTTTAAAACCATGTGTATATTGTTATCCTATACTCATTTCACTTTATCATTCT 
7141 + + + + h — + 7200 

ATCATTTCTCTTCCCATTTTCACACATTTCCATTTCTCTACGATAATCTAAAATTATGAC 
7201 + + + + + + 7260

GTTTGTGTCTCGAACGCATAATAATTTTAATAACTCGTTTTGAATTTGATTAGTTGTTGT 
7261 + + + + + + 7320 

GCCCAGTATATATGTATGTACTATGCTTCTATCAACAAAATAGTTTCATAGATCATCACC 
7321 + + + + + + 7380 

CCAACCCCACCAACCTACCGTACCATATTCATTTTTGCCGGGAATCAATTTCGATTAATT 
7381 + + + + + + 7440 

TTAACCTATTTTTTCGCCACAAAAAATCTAATATTTGAATTAACGAATAGCATTCCCATC 
7441 + + + + + + 7500

TCTCCCGTGCCGGAATGCCTCCCGGCCTTTTAAAGTTCGGAACATTTGGCAATTATGTAT 
7501 + + -+ + + + 7560 

AAATTTGTAGGTCCCCCCCATCATTTCCCGCCCATCATCTCAAATTGCATTCTTTTTTCG 
7561 + + + + + + 7620 

CCGTGATATCCCGATTCTGGTCAGCAAAGATCT 
7621 '-^ + + 7653 
Lines

1 1 MMRQDRRSLLERNIMMFSSHLKVDEILEVLIAKQVLNSDNGDMINSCGTV 50

2 W_ LE. . .K.QA, L. .D V...,R.E 

3 7VS»S1-I..R. ►A 

1 51 REKRREIVKAVQRRGDVAFDAFYDALRSTGHEGLAEVLEPLARSVDSNAV 100

2 .DNEK R..E D . . .ND . .D . .M. . S . P .P. 

3 

1 101 EFECPMSPASHRRSRALSPAGYTSPTRVHRDSVSSVSSFTS_YQDIYSRA 149

2 PM S P .A I T,.,V,... 

3 S 

1 150 RSRSR_SRALHSSDRHNYSSPPVNAFPSQPSSANSSFTGCSSLGYSSSRN 198 

2 ..S..S.,P.Q M.AA_TS A 

3 T. . , , *P..T V. ,S_.S.Q. . .A S. T 

1 199 RSFSKASGPTQYIFHEEDMNFVDAPTISRVFDEKTMYRNFSSPRGMCLI 247

2 . T.AQS Y H .L. . . 

3 ..Y...,AHS, Y H T,..L... 

1 248 INNEHFEQMPTRNGTKADKDNLTNLFRCMGYTVICKDNLTGRGMLLTIRD 297 

2 I E..S...S 

3 P. ... IS I.H M 

1 298 FAKHESHGDSAILVILSHGEENVIIGVDDIPISTHEIYDLLNAANAPRLA 347

2 . GRNDM . . . ; VS VNV 

3 N.T VSVNV....X 

1 348 NKPKIVFVQACRGERRDNGFPVLDSVDGVPAFLRRGWDNRDGPLFNFLGC 397

2 L SLI 

3 L V LI KG 

1 398 VRPQVQQVWRKKPSQADILIRYATTAQYVSWRNSARGSWFIQAVCEVFST 447 

2 . . , , . .M. ,A L 

3 A...,/V A L 

1 448 HAKDMDVVELLTEVNKKVACGFQTSQGSNILKQMPEMTSRLLKKFYFWPE 497

2 , L 

3 A L 

1 498 ARN - SAV 503

2 DRG. . . ■ . 

3 D . , RS . . . 



Line 1 C. elegans 
Line 2 C. briggsae
Line 3 C. vulgaris 